CHAPTER 25 - WHAT DOES IT ALL MEAN?
What does it all mean? Not a whole lot, actually. We can now save
memory space and we can save a lot of time. But with the
incessant march of technology these things mean less and less. A
few years ago, when 64k or 128k was a lot of memory and memory
was expensive, having a 20k program instead of a 40k program was
a significant advantage. Now it means almost nothing unless it is
a memory resident program. What about disk space? Just a while
back we were operating with two 360k floppy disks and a hard disk
was too expensive. Nowadays everyone has a 20meg hard disk.{1} And
speed? Those programs that were slow on an 8088 now seem o.k. on
an 80386. Those programs that were unbearably slow on the 8088
died a quick death and are no longer around.
Compilers are better and they have more subroutines available.
It is also easier to program with them than to go down to the
assembler level. What this chapter is about is when NOT to use
the standard compiler functions and subroutines.
First, you should understand that all compiler subroutines are
general purpose subroutines. They need to be all things to all
people. Imagine what a vehicle would be like if we gave the
designer the following specification:
We want to be able to drive to the store for groceries. It
should be fuel efficient. In case we want to go into the
mountains it should be an all terrain vehicle. We also want
to be able to haul a roomful of furniture from coast to
coast. Oh yes, and we want to be able to race it at Le Mans.
Being universal requires a lot of code and it slows things down.
Whether this extra code and time is too much is a question you
need to decide for yourself. First, here are some examples of
size. This is a C program that does almost nothing:
    #include <stdio.h>

    int main(void)
    {
        int x ;

        x = 27 ;                    /* line 1 */
        scanf ( "%d", &x ) ;        /* line 2 */
        printf ( "%d\n", x ) ;      /* line 3 */
        return 0 ;
    }
____________________
1. Which has led to one of my pet peeves. All installation
programs for compilers and word processors dump EVERYTHING on the
hard disk. This gives us subdirectories that have 50 files in
them, and we don't have the foggiest notion of what any of the
files are for. If these installation programs would only prompt
us by type of file to find out what we want to install and want
to leave off the hard disk, we would all be better off.
______________________
The PC Assembler Tutor - Copyright (C) 1990 Chuck Nelson
I have made 3 programs from this. LINE1.C has line 1 only. LINE2.C
has lines 1 and 2. LINE3.C has lines 1, 2 and 3. For you non-C
people, scanf is an input function and printf is an output
function. Guess how big each program is. Here's the directory
listing.
LINE1 EXE 3176 6-22-90 8:48a
LINE2 EXE 7170 6-22-90 8:49a
LINE3 EXE 9134 6-22-90 8:49a
It takes about 3000 bytes to start a C program (this is the
startup module), about 4000 bytes more to read something in and
about 2000 bytes more to print something. That first 3000 bytes
is unavoidable if you are writing in a high level language. If
you are doing a lot of general purpose i/o, these extra amounts
aren't too bad. There are two cases where you might want to use
your own i/o routines: first, when you have something simple, and
second, when you have something special.
If you don't need all that flexibility, you are better off doing
your own i/o. Here are two files that write a text screen to a
disk file.
COPYSCRN EXE 10454 6-10-90 9:30a
INTSCRN COM 445 6-12-90 7:40p
They both do the same thing except that the .COM file is a little
more sophisticated. Notice the difference in size. Speed really
doesn't play a part here because what they do is so simple that
it takes just a second in any case. The program was so simple
that it only took an hour or two to write, so I didn't lose any
time by writing it in assembler.
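Doing your own i/o for something simple does not have to mean
assembler. As a rough sketch of the idea in C (the name utoa_dec
is made up for this example, and nothing here says how many bytes
it would save on your compiler), here is a routine that turns an
unsigned number into decimal text without going through printf:

```c
/* Hypothetical helper: convert an unsigned value to decimal text.
   buf must have room for the digits plus a terminating zero
   (24 bytes is plenty for any common size of unsigned).
   Returns the number of digits written. */
unsigned utoa_dec(unsigned value, char *buf)
{
    char tmp[24];
    unsigned n = 0;

    /* Peel off digits, lowest first. The do/while makes 0 come
       out as a single "0". */
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    /* The digits are backwards in tmp, so copy them out reversed. */
    for (unsigned i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

It does one thing only, which is the whole point. Whether the
saving is worth giving up printf's flexibility is the same
question as before.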
The other case is when you have a specific idea of what the
screen should look like. You want control of the whole screen all
the time. This includes all word processors, databases,
programming environments, etc. They all take charge of the screen
because some DOS functions are too slow. If you remember from the
ZOOM chapter, there is a radical difference between what you can
do and what DOS can do. Even though these large programs are
written in C, they all bypass the C i/o functions. That does not
mean that they go down to the assembler level, however.
INTERRUPTS
You have done a few interrupts. They call the standard DOS or
BIOS functions. Remember, they do this by going into low memory
(the first 1k of memory) and getting the address of the
subprogram that handles that particular interrupt. However, you
do not need to call these interrupts from the assembler level.
All modern compilers support interrupt calls in the language. If
yours doesn't, you need a more recent compiler. Before going on
with this chapter you need to read your compiler documentation
about interrupts. TURBO Pascal has INTR, QuickBASIC and QuickC
have INT86. Read the documentation now.
Have you read it? No cheating is allowed, because you won't
understand the rest of this if you haven't read it.
Though technically the register set is a structure in C and a
record in Pascal, it is really an array where each array element
has a specific name. The interrupt routine loads each of these
values into the corresponding register, calls the interrupt, then
stores the register values back into the array. Some languages
have one array for the input and another for the output. Int 21h
is special, so QuickC has a special function called INTDOS.
Calling it is the same as asking for Int 21h.
The order of registers in the array is arbitrary and language
dependent. For Turbo Pascal it is (AX, BX, CX, DX, BP, SI, DI, DS,
ES, FLAGS). You enter values in the registers specified by the
interrupt, and then call the interrupt. The routine does the
rest.
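To make the "array with names" idea concrete, here is a hedged
sketch in C. It is NOT Turbo Pascal's declaration or any
compiler's real header. The names reg_block and reg_offset are
invented for this example. The only thing taken from the text is
the order of the registers, each one being a 16 bit word:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block in the order the text gives:
   AX, BX, CX, DX, BP, SI, DI, DS, ES, FLAGS.
   Ten 16-bit words, so it is really a 10-word array whose
   elements happen to have names. */
struct reg_block {
    uint16_t ax, bx, cx, dx, bp, si, di, ds, es, flags;
};

/* Byte offset of the n-th word in the block (0 = AX, 9 = FLAGS). */
size_t reg_offset(unsigned n)
{
    return (size_t)n * sizeof(uint16_t);
}
```

With this layout BP sits at offset 8, SI at 10, DS at 14 and
FLAGS at 18. Those are the same displacements from SI that show
up in the assembler routine later in this chapter.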
    INTR ( int_no: byte; var the_regs: Registers)
This will push the interrupt number, then the array address. On
entry to the routine, after BP has been set up, we will have:

            int_no              bp + 6
            array_address       bp + 4
            old IP              bp + 2
    bp ->   old BP              bp

What follows is not the exact code, but is similar to what the
Pascal routine does:
; - - - - - - - - - -
intr    proc  near
        push  bp
        mov   bp, sp
        push  ax                ; save all registers except SP, DS, SS, CS
        push  bx
        push  cx
        push  dx
        push  si
        push  di
        push  bp                ; this is OUR bp
        push  es

        ; insert the interrupt number in the interrupt
        mov   al, [bp+6]        ; AL now contains the interrupt number
        lea   si, interrupt_spot ; where the interrupt is
        mov   cs:[si+1], al     ; insert it in the interrupt

        ; change all the registers
        mov   si, [bp+4]        ; array address is DS:SI
        mov   ax, [si]
        mov   bx, [si+2]
        mov   cx, [si+4]
        mov   dx, [si+6]
        mov   bp, [si+8]
        mov   di, [si+12]
        mov   es, [si+16]

        ; special manipulation for DS and SI
        push  ds                ; save ds
        push  si                ; save si
        push  ax                ; temp save of ax from array
        mov   ax, [si+14]       ; ds from array to ax
        mov   si, [si+10]       ; si from array to si
        mov   ds, ax            ; now move ax to ds
        pop   ax                ; restore ax

        ; call the interrupt
interrupt_spot:
        int   0                 ; dummy number for the interrupt

        ; special needs for SI and DS
        ; our SI and DS are at the top of the stack
        ; save values of flags, si and ds from interrupt
        pushf                   ; value from interrupt
        push  si                ; value from interrupt
        push  ds                ; value from interrupt
        add   sp, 6             ; get to our si and ds
        pop   si                ; our si
        pop   ds                ; our ds
        sub   sp, 10            ; sp is where it was a moment ago.

        mov   [si], ax          ; DS:SI points to array
        mov   [si+2], bx
        mov   [si+4], cx
        mov   [si+6], dx
        mov   [si+8], bp
        mov   [si+12], di
        mov   [si+16], es
        pop   word ptr [si+14]  ; ds from the interrupt
        pop   word ptr [si+10]  ; si from the interrupt
        pop   word ptr [si+18]  ; flags from the interrupt
        add   sp, 4             ; skip our DS and SI (already in regs)

        pop   es
        pop   bp                ; this is OUR bp
        pop   di
        pop   si
        pop   dx
        pop   cx
        pop   bx
        pop   ax
        mov   sp, bp
        pop   bp
        ret   (4)               ; clear arguments off the stack
intr    endp
; - - - - - - - - - - - - - - - - - - - -
This should test your insight into using code. DS and SI are
needed for moving data, so we use some kludges to get it to work.
There are two things here that you shouldn't normally do. First,
we are inserting the interrupt number directly in the machine
code. Secondly, we are playing around with the value of SP. These
are rare exceptions and shouldn't occur in your own code unless
absolutely necessary.
The first thing you are going to say is, "Gosh, that's a lot of
code for one interrupt." True, especially when the interrupt is
interrupt 12h. Here's int 12h inside of our template file:
; - - - - - START CODE HERE
        int   12h               ; machine memory (return in ax)
        call  print_unsigned
; - - - - - END CODE HERE
It finds out how much memory your computer has and returns the
number of kbytes in AX. But how much extra time does using this
Pascal interrupt routine take? About 700 clocks or about .0002
seconds (that's right, 2 ten thousandths) on the slowest machine.
How many times will you call it during a program? Only one time.
There is no point in going down to the assembler level to write a
program that saves you .0002 seconds. In Pascal, you would write:
INTR ( $12 , the_regs) ;
and be done with it. No big loss of time and no trouble at all.
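If you want to check that figure, the arithmetic is just clocks
divided by clock rate. The sketch below ASSUMES the slowest
machine is a 4.77 MHz 8088 PC. The text does not name a clock
rate, so that number is my assumption, and clocks_to_seconds is a
made-up name:

```c
/* Time in seconds taken by a number of clock cycles at a given
   clock rate in cycles per second. */
double clocks_to_seconds(double clocks, double hertz)
{
    return clocks / hertz;
}
```

With 700 clocks and 4,770,000 clocks per second this comes to
about .00015 seconds, which is the ".0002 seconds" above once you
round it. On a faster machine it is smaller still.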
In fact, as far as I can see, there is no reason for doing any
interrupts from the assembler level. You may want to do a whole
subprogram that contains interrupts, but if you just need one or
two interrupts, it is easier to work from inside the high-level
language.
This includes the i/o we were talking about a minute ago. You can
write a screen program inside a high level language using arrays.
Just think of a screen as an 80 X 25 array. If a two dimensional
array is too slow you need to go to a one dimensional array. All
interrupts that tell what kind of video card is in the computer,
what mode the screen is in, etc. can be done from the high-level
language. The most you need assembler for (depending on the
language) is moving the text array into video memory. You want a
bunch of help screens? Put all the help screens in a single file
and use the interrupt for random access file read to read a
screen when you need it.{2}
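Here is a small sketch in C of the screen held as a one
dimensional array. It assumes the usual PC text mode layout of
one character byte followed by one attribute byte per cell. The
names cell, screen, cell_index and put_text are invented for the
example, and moving the array into video memory is left out
because that part depends on your language:

```c
#include <stddef.h>

#define SCREEN_COLS 80
#define SCREEN_ROWS 25

/* One text cell: character byte first, attribute byte second. */
struct cell {
    unsigned char ch;
    unsigned char attr;
};

/* The whole screen as a one dimensional array, row after row. */
static struct cell screen[SCREEN_ROWS * SCREEN_COLS];

/* Index of (row, col) in the array; both are counted from 0. */
size_t cell_index(unsigned row, unsigned col)
{
    return (size_t)row * SCREEN_COLS + col;
}

/* Write a string into the array starting at (row, col).
   Anything that would run past the last cell is dropped. */
void put_text(unsigned row, unsigned col, const char *s,
              unsigned char attr)
{
    size_t i = cell_index(row, col);

    while (*s != '\0' && i < (size_t)SCREEN_ROWS * SCREEN_COLS) {
        screen[i].ch = (unsigned char)*s++;
        screen[i].attr = attr;
        i++;
    }
}
```

A call like put_text(1, 2, "Hi", 0x07) fills cells 82 and 83,
since row 1, column 2 is 1 * 80 + 2. The multiply is done once
per string instead of once per character, which is where the one
dimensional version gains over the two dimensional one.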
Anything else? Yes, we still have the need for speed. There are
certain types of operations like block moves of data, word
searches and sorting of arrays that are characterized by large
amounts of data and/or large amounts of computation. If you think
you see a way to use registers effectively for one of these
things, you probably can beat a compiled version of the
subprogram. Then the only question is whether or not it is worth
the trouble.
____________________
2. What you actually want to do is have the first block of
data in the file tell you where each screen is and how long its
data is. Then the first 2 bytes or words of the screen data
should say the dimensions of the screen data ( 12 X 25, 17 X 3,
etc.). This will allow you to store and use screens of any size.
____________________
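The help screen file from footnote 2 can also be handled without
assembler. This is only a sketch in standard C. The structure
screen_entry and the function read_screen are hypothetical names,
and the directory is assumed to have been read from the first
block of the file already:

```c
#include <stdio.h>

/* Hypothetical directory entry: where one screen starts in the
   file and how many bytes of data it has. */
struct screen_entry {
    long     offset;
    unsigned length;
};

/* Read screen number n into buf, which holds bufsize bytes.
   Returns the number of bytes read, or 0 if the screen does not
   fit or the seek fails. By the layout assumed here, the first
   two bytes of the data are the screen's dimensions. */
size_t read_screen(FILE *fp, const struct screen_entry *dir,
                   unsigned n, unsigned char *buf, size_t bufsize)
{
    if (dir[n].length > bufsize)
        return 0;
    if (fseek(fp, dir[n].offset, SEEK_SET) != 0)
        return 0;
    return fread(buf, 1, dir[n].length, fp);
}
```

The fseek followed by fread is the random access read. You go
straight to the screen you want and never touch the others.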
We have used the words "fast" and "slow" ambiguously so far, but
now it is time to quantify them. Before you get the numbers, you
need to know one thing about memory. People always talk about the
"data bus". What is it? It is a group of wires connecting the
80x86 chip to memory. The 8088 has 8 wires, the 8086, 80286 and
80386SX have 16 wires, and the 80386 has 32 wires. That means
that the 8088 can transfer 8 bits of information at one time, the
8086 et al. can transfer 16 bits at a time and the 80386 can
transfer 32 bits at a time. This means one byte, two byte and
four byte transfers respectively. This also means that the memory
bytes are ordered a little differently. You will never notice it
externally, but here is the different internal ordering.
The 8088 has all bytes one after the other. All memory
read/writes are done with the same 8 wires:
8088
MEMORY ADDRESSES
00005
00004
00003
00002
00001
00000
data lines |||||||| (8 bits)
(All our examples will use absolute memory locations starting at
00000). The chips with a 16 bit data bus have all the even
locations on the first 8 wires and the odd locations on the other
8 wires. They come in pairs - first even then odd:
8086
MEMORY ADDRESSES
00006 00007
00004 00005
00002 00003
00000 00001
data lines |||||||| |||||||| (16 bits)
When one of these chips reads or writes, it can read/write either
the left or the right byte or the whole word. What it cannot do
is read the right byte from one pair along with the left byte
from another pair. If you want to read the word at 00005:00006,
the 8086 must:
1) read the 00005 byte.
2) read the 00006 byte.
3) join them together.
This takes longer than just a single word read.
The true 80386 has a 32 bit data bus. This allows it to read 4
bytes at a time, and its physical memory structure looks like
this:
80386
MEMORY ADDRESSES
00010 00011 00012 00013
0000C 0000D 0000E 0000F
00008 00009 0000A 0000B
00004 00005 00006 00007
00000 00001 00002 00003
data lines |||||||| |||||||| |||||||| |||||||| (32 bits)
Instead of memory pairs, we now have memory quadruplets. As long
as a word is totally inside of one quadruplet, the read/write
time will be unaffected. If the read/write crosses a quadruplet
boundary (the word at 00007:00008, for example), it takes two
reads, just as the word at 00005:00006 did on the 8086. The 80386
can also read 4 byte data quickly as long as the total data is
inside of one memory quadruplet.
In the 8086 family, data can always be read across these
boundaries but it takes more time. (On the IBM 370, on the other
hand, there are instructions that REQUIRE that data be aligned
along 32 bit boundaries).
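The boundary rule can be put in a few lines of C. This is a
simplified model, not a timing table: it only counts how many
rows of the diagrams above a piece of data touches, taking the
bus width as 1, 2 or 4 bytes. The name bus_transfers is made up
for the example:

```c
/* Number of bus transfers needed for `size` bytes starting at
   address `addr` on a data bus that is `bus_bytes` wide.
   Simplified model: one transfer per row of memory touched. */
unsigned bus_transfers(unsigned long addr, unsigned size,
                       unsigned bus_bytes)
{
    unsigned long first = addr / bus_bytes;              /* row of first byte */
    unsigned long last  = (addr + size - 1) / bus_bytes; /* row of last byte  */

    return (unsigned)(last - first + 1);
}
```

For the word at 00005 this gives 2 transfers on a 16 bit bus and
1 transfer on a 32 bit bus, since 00005 and 00006 are in the same
quadruplet. The word at 00007 takes 2 on both. On the 8 bit bus
of the 8088 every word takes 2 no matter where it is.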
This means you should order your data in the following way in the
data segment:
QWORD DATA
DWORD DATA
TBYTE DATA ; this is for the 8087
WORD DATA
BYTE DATA ; all strings, etc.
This ensures that any read/write for that type of data will
always be as fast as possible. If the segment definition has no
alignment type, it will start on a paragraph boundary - i.e.
every 16 bytes, and will work with anything. {3}
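To see why the order matters, here is a little C model that lays
items out one after another from offset 0 and counts how many
land off their preferred boundary. The boundaries used (4 for a
DWORD, 2 for a WORD, 1 for bytes) are my assumption, based on the
bus widths described above, and misaligned_items is a made-up
name:

```c
#include <stddef.h>

/* Count the items that do not start on their preferred boundary
   when laid out back to back from offset 0.
   size[i]  = length of item i in bytes
   align[i] = preferred boundary for item i (1, 2 or 4) */
unsigned misaligned_items(const unsigned *size,
                          const unsigned *align, size_t n)
{
    unsigned long offset = 0;
    unsigned bad = 0;

    for (size_t i = 0; i < n; i++) {
        if (offset % align[i] != 0)
            bad++;              /* item i straddles a boundary */
        offset += size[i];
    }
    return bad;
}
```

Put a 5 byte string first, then a word, then a dword, and both
the word (offset 5) and the dword (offset 7) are misaligned. Put
the dword first, then the word, then the string, and nothing is.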
In addition, if you ever subtract a number from SP to provide for
a temporary data area, it should always be an even number. If SP
is at an odd address instead of an even address, it takes longer
for PUSHes and POPs. Also, when you define the size of the stack
segment, it should be an even number of bytes.
____________________
3. The alignment type is a word after the word SEGMENT which
says how the segment should be aligned. The following:
DATASTUFF SEGMENT BYTE PUBLIC 'DATA'
says the segment can be aligned at any byte. The allowable forms
are BYTE, WORD, DWORD, PARA, PAGE (256 bytes). If there is no
explicit type, the default is PARA.
____________________
Having said that, it is now time for you to see the speeds of
instructions. Read the introduction to APPENDIX III, then glance
at the times to get a general idea of how long instructions take.
Come back to this chapter when you are comfortable with what the
times look like.
Have you read APPENDIX III? If not, do it before going on.
The compiled languages all have one thing in common. They tell
you that if you are writing a subroutine, you need to return from
the subroutine with DS, BP, SS, and SP unchanged. They don't say
a thing about any of the other registers. One thing this tells us
is that they are doing everything from memory locations, not
register locations. If you have taken a good look at the
execution times, you will have noticed the phenomenal difference
in time between a "memory, register" addition and a "register,
register" addition.
Now, if all you are going to do is move a number to a register,
add it, and move it out again, a compiler can do it as fast as
you can. But, if you run into a situation where you can use three
or four registers at the same time, you can cut the execution
time drastically. Compilers really can't use registers as
efficiently as we can (yet). This is an ideal spot for using
assembly language.
The old adage that 10% of the code uses 90% of the computer time
is appropriate here. You now know about assembler language, and
you know what you want to do with it, so go out and enjoy. But
before you do, try to slog your way through the next chapter on
"simplified" segment definitions and linking to high level
languages.